A Misspelling Intelligent Analysis Approach for Correcting Misspelled Words in English Text
Abstract
This paper proposes MIA (Misspelling Intelligent Analysis), an approach for efficiently detecting and intelligently correcting misspelled words. An integrated spelling correction approach must handle both non-word errors and real-word errors. For non-word errors, MIA ranks the suggestions for each misspelling using word frequency statistics, lexicon data, character distance, and conditional probability. For real-word errors, MIA draws on context information to calculate an overall score or probability, which serves as the key for correction. In particular, feature compensation and combination are provided to improve the accuracy of real-word error correction in articles written by Chinese students. The experiments show that MIA performs better at error detection, discrimination, and correction than current methods for dealing with misspelled words.
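The abstract does not say how MIA combines its features, so the following is only a minimal sketch of the general idea of ranking suggestions for a non-word error by word frequency and character distance. The toy lexicon, the use of Levenshtein distance as the "character distance", the function names, and the scoring formula are all assumptions made for illustration; conditional probability and context features are left out.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rank_suggestions(word, freq, max_distance=2, top_n=3):
    """Rank lexicon entries as corrections for `word`.

    Hypothetical scoring (not taken from the paper): relative word
    frequency divided by (1 + edit distance).
    """
    if word in freq:
        # Found in the lexicon, so it is not a non-word error.
        return [word]
    total = sum(freq.values())
    scored = []
    for cand, count in freq.items():
        d = edit_distance(word, cand)
        if d <= max_distance:
            scored.append(((count / total) / (1 + d), cand))
    # Highest score first; ties broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [cand for _, cand in scored[:top_n]]


# Toy lexicon with made-up counts, for illustration only.
LEXICON = Counter({"the": 500, "then": 60, "they": 80,
                   "spelling": 12, "spilling": 3, "selling": 9})

print(rank_suggestions("speling", LEXICON))
```

With this toy lexicon, "spelling" ranks first because it is the only candidate one edit away, while "selling" outranks "spilling" at the same distance because of its higher count. A real-word error such as "then" for "than" would pass the lexicon check untouched, which is why the abstract treats that case separately using context.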
Similar resources
Non-word identification or spell checking without a dictionary
s is about 0.0015 while in titles it is near 0.0006. From 1980 on, the proportion of mistakes in abstracts is about 0.0002 while in titles it is less than 0.0001. As often as abstracts are reviewed for mistakes, the title of a paper is considered even more. The drop in the proportion of misspelled words in the early 1980s coincides with the widespread adoption of word processors and personal co...
Not the Word I Wanted? How Online English Learners' Dictionaries Deal with Misspelled Words
This study looks at how well the leading monolingual English learners’ dictionaries in their online versions cope with misspelled words as search terms. Six such dictionaries are tested on a corpus of misspellings produced by Polish, Japanese, and Finnish learners of English. The performance of the dictionaries varies widely, but is in general poor. For a large proportion of cases, dictionaries...
Material Development and English for Academic Purposes Word Lists; a Reductionist Approach
Nagy (1988) states that vocabulary is a prerequisite factor in comprehension. Drawing upon a reductionist approach and having in mind the prospects for material development, this study aimed at creating an English for Academic Purposes Word List (EAPWL). The corpus of this study was compiled from a corpus containing 6,479 pages of texts, 2,081,678 tokens (running words) and 63,825 types (...
字形相似別字之自動校正方法 (Automatic Correction for Graphemic Chinese Misspelled Words) [In Chinese]
Whether Chinese is learned as a first or a second language, misspelled words are an important issue that needs to be addressed. Many studies have offered suggestions for correcting the misspelled words of students who are still in school, as well as teaching and learning strategies for Chinese characters for teachers. Although in schooling, it does to prevent students who...
Bilingual Random Walk Models for Automated Grammar Correction of ESL Author-Produced Text
We present a novel noisy channel model for correcting text produced by English as a second language (ESL) authors. We model the English word choices made by ESL authors as a random walk across an undirected bipartite dictionary graph composed of edges between English words and associated words in an author’s native language. We present two such models, using cascades of weighted finite-state tra...
Journal title:
- JCIT
Volume 5, Issue
Pages -
Publication date: 2010